An empirical study of high availability in stream processing systems

Authors

  • Yu Gu
  • Zhe Zhang
  • Fan Ye
  • Hao Yang
  • Minkyong Kim
  • Hui Lei
  • Zhen Liu
Abstract

High availability (HA) is critical for many stream processing applications such as financial data analysis and disaster response. Existing HA schemes use either active standby or passive standby to guard the system against unexpected failures such as machine crashes. Although previous simulation-based studies report that active standby is superior, there is a lack of in-depth understanding of the tradeoff between different HA approaches under practical settings. In this paper, we propose a novel sweeping checkpointing method that can reduce the overhead by one order of magnitude. Whereas most previous work addresses single failures, we prove that the sweeping checkpointing method ensures no loss of data even against multiple concurrent failures. We then implement and compare the resulting passive standby variant against active standby using a real stream processing system. We find that passive standby presents a different tradeoff from active standby: longer recovery time, but 90% less overhead. Thus each approach has its suitable scenarios.

Similar resources

A Comparison of Stream-Oriented High-Availability Algorithms

Recently, significant efforts have focused on developing novel data-processing systems to support a new class of applications that commonly require sophisticated and timely processing of high-volume data streams. Early work in stream processing has primarily focused on stream-oriented languages and resource-constrained, one-pass query-processing. High availability, an increasingly important goal...

Application of “Sink & Source” and “Stream wise” Methods for Exergy Analysis of Two MED Desalination Systems

Fossil fuels are commonly used to supply the energy required by desalination systems. On the other hand, solar energy is one of the high-grade energies in the world and is found especially in hot-weather regions. Therefore, using solar energy to operate desalination systems will reduce greenhouse gases and is a good alternative. Common exergy analysis method (s...

An efficient approach for availability analysis through fuzzy differential equations and particle swarm optimization

This article formulates a new technique for behavior analysis of systems through fuzzy Kolmogorov's differential equations and Particle Swarm Optimization. To handle the uncertainty in data, differential equations have been formulated by Markov modeling of the system in a fuzzy environment. First, the solution of these derived fuzzy Kolmogorov's differential equations has been found by Runge-Kutta four...

Key Technologies for Big Data Stream Computing

As a new trend for data-intensive computing, real-time stream computing is gaining significant attention in the Big Data era. In theory, stream computing is an effective way to support Big Data by providing extremely low-latency processing tools and massively parallel processing architectures in real-time data analysis. However, in most existing stream computing environments, how to efficiently...

An Empirical Study about Why Dissatisfaction Arises Among the Employees and What It Consequences: Bangladesh Perspective

This article aimed to identify the rate of dissatisfied employees who had left their previous jobs and the main factors that caused their dissatisfaction. To collect data for this study, a well-structured questionnaire was distributed to 150 employees of different private and public organizations in Bangladesh who had already left their previous jobs, and 142 usable responses were rec...

Publication date: 2009